Statistical Disclosure Control

Research on tabular data (WP 3)

Leading partner: StBA

Participating partners: StBA, TUIlm

Objectives

The work-package aims to provide the methodology, expertise and software needed to reach the overall goal of the proposal as far as tabular data are concerned: to create a package suitable to be established as a standard tool for disclosure control of aggregated data. In terms of project management, the objective of this work-package is to link the needs of the end-users to the software development, so as to ensure wide usability and user-friendliness of the resulting software package.
Within WP 3 all tasks will be co-ordinated by StBA.


Specific aims to address the overall goal of the work-package are the following:

Task 1

Refine and support the integration of the most desirable qualities and facilities of existing software systems for tabular data protection into τ-ARGUS.

Software concept and design: Develop, propose and/or refine software concepts and the design of user interfaces, and supply methodological expertise (e.g. formulas, concepts and ideas taken from other cell suppression packages). Work will be carried out by the co-ordinator of WP 3 (StBA) in close co-operation with the co-ordinator of WP 4.2 (CBS).

The result will be a concept and design for user-friendly cell suppression software with a wide range of applicability in various situations, in particular to linked multiple and hierarchical tables, with special features to synchronise suppression patterns between tables which are already published and those which are intended to be newly released (e.g. customised tables).


Task 2

Support the integration of the most recent version of the GHQUAR suppression algorithm (cf. sec. 5 (Innovation)), which will ensure wide applicability of the package to (linked) tables of any size and complexity of structure.

Supply GHQUAR: Support the implementation of the most recent version of GHQUAR (applicable to linked tables and with automated weighting functionality for control of the selection of secondary suppressions); supply expertise on this software and make minor modifications to the GHQUAR software, in case they turn out to be necessary. Work will be carried out by a subcontractor, e.g. the developer of the GHQUAR software.

The result will be the integration of software suitable for the selection of secondary suppressions in multiple tables of any size and complexity of structure.


Task 3

Ensure the practical significance of the research on optimisation problems arising in the development of cell suppression algorithms based on linear programming, which is to be carried out as one of the tasks of WP 4.1.

Support for the research on optimisation problems: A library of ‘close-to-real-life’ test instances will be developed and supplied to the OR researchers involved in the project (cf. WP 4.1). The set-up of the optimisation problems (cf. WP 4.1), as well as the research progress, will be checked with the assistance of an independent expert (i.e. an expert not involved in the development of linear programming methodology for cell suppression within this project). Work will be carried out by the co-ordinator of WP 3 (StBA) in close co-operation with a subcontractor (‘independent’ OR expert) and with all partners involved in WP 4.1.

Close collaboration, providing assistance and feedback to the research partners involved in the development of algorithms suitable for the selection of secondary suppressions in moderately sized hierarchical tables, will reduce the risk that considerable amounts of research are spent on problems beyond practical significance. This will yield software that keeps a good balance between quality (of the resulting suppression patterns, in terms of information loss due to suppression) and quantity (e.g. the size of the tables that the software can be applied to efficiently).


Task 4

Provide information on the performance of the various algorithms for secondary cell suppression to be included in the final package. This information will support the transfer of the package and will also be useful for guiding internal decisions to be made within the software implementation work-package WP 4.2.

Benchmarking: Each of the algorithms for the selection of secondary suppressions implemented in the package will be run on the set of tables from the test library (see Task 3 above). Performance with respect to certain key issues (information loss in terms of the number and/or total value of suppressions, computing time required, etc.) will be recorded. Work will be carried out by a subcontractor (University OR-department).
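The key figures named above could be recorded along the following lines. This is only a minimal sketch: the function `benchmark`, the stand-in `toy_algorithm`, the dictionary layout of the table and the cell labels are assumptions made for illustration and are not part of τ-ARGUS or of any existing package.

```python
import time


def benchmark(algorithm, table, primary):
    """Run one secondary-suppression algorithm on one test table and record
    the number of suppressions, the total suppressed value and the time used.

    `table` maps a cell label to its value, `primary` is the set of sensitive
    cells, and `algorithm(table, primary)` is assumed to return the secondary
    suppressions (hypothetical interface).
    """
    start = time.perf_counter()
    raw_result = algorithm(table, primary)
    seconds = time.perf_counter() - start

    primary = set(primary)
    secondary = set(raw_result) - primary
    suppressed = primary | secondary
    return {
        "n_primary": len(primary),
        "n_secondary": len(secondary),
        "value_suppressed": sum(table[cell] for cell in suppressed),
        "seconds": seconds,
    }


def toy_algorithm(table, primary):
    # Stand-in for a real algorithm: returns a fixed pattern that completes
    # a 2x2 block around the single primary suppression used below.
    return {("r1", "c2"), ("r2", "c1"), ("r2", "c2")}


# Hypothetical test table (interior cells only).
table = {
    ("r1", "c1"): 20, ("r1", "c2"): 50, ("r1", "c3"): 30,
    ("r2", "c1"): 10, ("r2", "c2"): 40, ("r2", "c3"): 25,
}
result = benchmark(toy_algorithm, table, primary={("r1", "c1")})
print(result)
```

Running every implemented algorithm through the same function on every table of the test library would give directly comparable records of information loss and computing time.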


Task 5

Maximise the information content of tables with suppressed entries, e.g. after a suppression procedure has been carried out.

Table perturbation techniques: Development and implementation of algorithms to calculate the upper and lower values which any suppressed value could have without violating the constraints implied by the additive relationships within the table, and to calculate perturbed values to replace suppressed original cell entries. The perturbed values will be located between the upper and lower bounds (see above) and will match submarginals and marginals, thus implying that table additivity will be maintained. Work will be carried out by a subcontractor (University OR-department).

The package will be able to calculate lower and upper bounds for suppressed entries that can be released safely, thus reducing the loss of information to the amount needed to protect the sensitive cells. Instead of being presented with suppressed entries or with lower and upper bounds for them, data users may for many purposes prefer a single value to replace the suppressed entry: the software will offer perturbed values of the original cell entries, which will maintain table additivity and will not disclose individual information either.
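To illustrate what such bounds look like, the following minimal sketch derives an interval for each suppressed cell from the additive relationships alone. It assumes non-negative cell values and uses simple interval propagation rather than the linear programming a production algorithm would more likely rely on; the resulting bounds are valid but, for larger suppression patterns, not necessarily the tightest possible. The function name `feasibility_bounds` and the input layout are hypothetical.

```python
def feasibility_bounds(constraints, suppressed, max_rounds=100):
    """Bound each suppressed cell using the table's additive relationships.

    `constraints` is a list of (residual, cells) pairs: the suppressed cells
    listed must sum to `residual`, i.e. the published marginal minus the
    published cells of that row or column. Cell values are assumed >= 0.
    """
    lower = {cell: 0.0 for cell in suppressed}
    upper = {cell: float("inf") for cell in suppressed}
    for _ in range(max_rounds):
        changed = False
        for residual, cells in constraints:
            for cell in cells:
                others = [o for o in cells if o != cell]
                # The cell is largest when the others are at their minimum,
                # and smallest when the others are at their maximum.
                new_upper = residual - sum(lower[o] for o in others)
                new_lower = residual - sum(upper[o] for o in others)
                if new_upper < upper[cell]:
                    upper[cell] = new_upper
                    changed = True
                if new_lower > lower[cell]:
                    lower[cell] = new_lower
                    changed = True
        if not changed:
            break
    return lower, upper


# Hypothetical 2x2 block of suppressed cells a, b (row 1) and c, d (row 2),
# with true values 20, 50, 10, 40 that are unknown to the data user.
constraints = [
    (70, ["a", "b"]),  # row 1 residual
    (50, ["c", "d"]),  # row 2 residual
    (30, ["a", "c"]),  # column 1 residual
    (90, ["b", "d"]),  # column 2 residual
]
lower, upper = feasibility_bounds(constraints, ["a", "b", "c", "d"])
for cell in "abcd":
    print(cell, lower[cell], upper[cell])
```

In this example the intervals are [0, 30] for a, [40, 70] for b, [0, 30] for c and [20, 50] for d, each of which contains the true value. Any perturbed values offered in place of the suppressed entries would have to lie inside these intervals and still add up to the row and column residuals.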


Task 6

Gain expertise with newly implemented facilities for control of the selection of secondary suppressions. In particular, the ‘European dimension’ of the secondary cell suppression problem will be addressed, e.g. how to ease and sustain approaches to co-ordinating suppression patterns within Europe, as suggested by Eurostat for application to data of the structural business survey (Doc. Eurostat/D2/SBS-T/NOV99/03).

Co-ordination of suppression patterns as a special application: Specific problems arise when data are published on different levels of a regional classification (e.g. on the national and on the supranational (EU) level, or on the regional and national level) but secondary suppressions are in fact assigned by different agencies (e.g. NSIs and Eurostat, or regional and national statistical institutes). This problem, which is due to the decentralised organisation of official statistics within Europe, will be tackled using the facilities of the software as implemented so far. The feasibility of several approaches to improve the situation will be researched (in particular the approach suggested by Eurostat, cf. the objective of Task 6 above), with a particular view to the practicability of any methods suggested. The methods will be applied to several real-life datasets, available on the national as well as the regional level. Methods that turn out to be promising will be supported by the software package, e.g. special options will be included in the software if necessary. Work will be carried out by the co-ordinator of WP 3 (StBA).

Expertise with newly implemented facilities for control of the selection of secondary suppressions will be acquired and will be used to give assistance to users with special needs, ensuring in particular that the package is applicable at Eurostat, to the extent possible, for the co-ordination of national suppression patterns.
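One elementary check that such co-ordination implies can be sketched as follows: a cell that appears in both the national and the supranational publication has to be suppressed in both or in neither, since otherwise one publication discloses what the other withholds. The function name and the cell labels below are purely illustrative assumptions.

```python
def unsynchronised_cells(national_suppressed, eu_suppressed, shared_cells):
    """Return the shared cells that only one of the two agencies suppresses."""
    shared = set(shared_cells)
    national = set(national_suppressed) & shared
    eu = set(eu_suppressed) & shared
    # Symmetric difference: suppressed on exactly one side.
    return sorted(national ^ eu)


# Hypothetical labels: country / activity class.
shared_cells = {"DE/A", "DE/B", "DE/C"}
national_suppressed = {"DE/A", "DE/B"}
eu_suppressed = {"DE/A", "FR/A"}

conflicts = unsynchronised_cells(national_suppressed, eu_suppressed, shared_cells)
print(conflicts)
```

Here `DE/B` is reported, because it is suppressed nationally but would be published at the supranational level; `FR/A` is ignored because it is not among the cells shared by the two publications.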